---
title: "Reference: MDocument | Document Processing | RAG | Kastrax Docs"
description: Documentation for the MDocument class in Kastrax, which handles document processing and chunking.
---

# MDocument

The `MDocument` class processes documents for RAG applications. Its main methods are `.chunk()`, which splits the document into chunks, and `.extractMetadata()`, which extracts metadata using the specified extractors.

## Constructor

<PropertiesTable
  content={[
    {
      name: "docs",
      type: "Array<{ text: string, metadata?: Record<string, any> }>",
      description: "Array of document chunks with their text content and optional metadata",
    },
    {
      name: "type",
      type: "'text' | 'html' | 'markdown' | 'json' | 'latex'",
      description: "Type of document content",
    }
  ]}
/>
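
As a rough illustration of the shapes in the table above, the sketch below writes the two parameters out as TypeScript types. The names `DocInput`, `DocType`, and `ConstructorArgs` are invented for this example and are not exports of the library; only the field names and type strings come from the table. Whether the constructor takes these as a single object or as positional arguments is not stated here, so the object form is an assumption.

```typescript
// Hypothetical type names; only the field shapes come from the table above.
type DocInput = { text: string; metadata?: Record<string, any> };
type DocType = 'text' | 'html' | 'markdown' | 'json' | 'latex';

// Assumed grouping of the two documented parameters into one object.
interface ConstructorArgs {
  docs: DocInput[];
  type: DocType;
}

// A value that satisfies the documented parameter shapes.
const args: ConstructorArgs = {
  docs: [
    { text: '# Title\n\nBody text.', metadata: { source: 'example.md' } },
    { text: 'A second entry without metadata.' }, // metadata is optional
  ],
  type: 'markdown',
};
```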

## Static Methods

### fromText()

Creates a document from plain text content.

```typescript
static fromText(text: string, metadata?: Record<string, any>): MDocument
```

### fromHTML()

Creates a document from HTML content.

```typescript
static fromHTML(html: string, metadata?: Record<string, any>): MDocument
```

### fromMarkdown()

Creates a document from Markdown content.

```typescript
static fromMarkdown(markdown: string, metadata?: Record<string, any>): MDocument
```

### fromJSON()

Creates a document from JSON content.

```typescript
static fromJSON(json: string, metadata?: Record<string, any>): MDocument
```
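
The four factory methods above share one signature pattern: a content string plus optional metadata. One plausible way to picture them is as thin wrappers that tag the content with a document type and hand it to the constructor. The sketch below shows that idea with a stand-in class; `SketchDocument` is hypothetical and is not the library's `MDocument`, whose internals are not described on this page.

```typescript
type DocType = 'text' | 'html' | 'markdown' | 'json' | 'latex';
type DocInput = { text: string; metadata?: Record<string, any> };

// Stand-in class for illustration only; not the library's MDocument.
class SketchDocument {
  docs: DocInput[];
  type: DocType;

  constructor(docs: DocInput[], type: DocType) {
    this.docs = docs;
    this.type = type;
  }

  // Each factory wraps the raw string in a single-entry docs array
  // and records which kind of content it holds (an assumption).
  static fromText(text: string, metadata?: Record<string, any>): SketchDocument {
    return new SketchDocument([{ text, metadata }], 'text');
  }

  static fromMarkdown(markdown: string, metadata?: Record<string, any>): SketchDocument {
    return new SketchDocument([{ text: markdown, metadata }], 'markdown');
  }
}

const plain = SketchDocument.fromText('Hello world', { source: 'note.txt' });
const md = SketchDocument.fromMarkdown('# Heading\n\nBody');
```

Using named factories rather than passing a type string by hand keeps the content type and the content together at the call site.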

## Instance Methods

### chunk()

Splits the document into chunks and optionally extracts metadata.

```typescript
async chunk(params?: ChunkParams): Promise<Chunk[]>
```

See [chunk() reference](./chunk) for detailed options.
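
To make the idea of chunking concrete, here is a minimal, self-contained sketch of a fixed-size character splitter with overlap. It is not the algorithm `chunk()` uses, and the function name `naiveChunk` and its `size` and `overlap` parameters are assumptions made for this illustration; the real options are covered in the chunk() reference.

```typescript
type Chunk = { text: string; metadata: Record<string, any> };

// Hypothetical helper: split text into windows of `size` characters,
// where each window starts `size - overlap` characters after the previous one.
function naiveChunk(
  text: string,
  size: number,
  overlap: number,
  metadata: Record<string, any> = {},
): Chunk[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error('size must be positive and overlap must be in [0, size)');
  }
  const chunks: Chunk[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    // Copy the document-level metadata and record where the chunk began.
    chunks.push({ text: text.slice(start, start + size), metadata: { ...metadata, start } });
    // Stop once this window has reached the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}

const pieces = naiveChunk('abcdefghij', 4, 1, { source: 'demo' });
// pieces holds the texts 'abcd', 'defg', 'ghij'
```

The overlap repeats a little context at each boundary, which can help retrieval when a relevant sentence straddles two chunks.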

### getDocs()

Returns the array of processed document chunks.

```typescript
getDocs(): Chunk[]
```

### getText()

Returns an array of the text strings from the chunks.

```typescript
getText(): string[]
```

### getMetadata()

Returns an array of the metadata objects from the chunks.

```typescript
getMetadata(): Record<string, any>[]
```
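
A natural reading of the three accessors above is that `getText()` and `getMetadata()` are index-aligned projections of the chunks returned by `getDocs()`. That alignment is an assumption rather than something this page states. The sketch below uses a plain array in place of a real document, with a hypothetical `Chunk` type, to show how the two arrays could be paired back up, for example when building records for a vector store.

```typescript
// Hypothetical chunk shape, standing in for what getDocs() returns.
type Chunk = { text: string; metadata: Record<string, any> };

const chunks: Chunk[] = [
  { text: 'First chunk.', metadata: { section: 'intro' } },
  { text: 'Second chunk.', metadata: { section: 'usage' } },
];

// Projections analogous to getText() and getMetadata() (assumed behavior).
const texts: string[] = chunks.map((c) => c.text);
const metadata: Record<string, any>[] = chunks.map((c) => c.metadata);

// Pair them back up by index.
const records = texts.map((text, i) => ({ id: i, text, metadata: metadata[i] }));
```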

### extractMetadata()

Extracts metadata using specified extractors. See [ExtractParams reference](./extract-params) for details.

```typescript
async extractMetadata(params: ExtractParams): Promise<MDocument>
```

## Examples

```typescript
import { MDocument } from '@kastrax/rag';

// Create document from text
const doc = MDocument.fromText('Your content here');

// Split into chunks with metadata extraction
const chunks = await doc.chunk({
  strategy: 'markdown',
  headers: [['#', 'title'], ['##', 'section']],
  extract: {
    summary: true, // Extract summaries with default settings
    keywords: true  // Extract keywords with default settings
  }
});

// Get processed chunks
const docs = doc.getDocs();
const texts = doc.getText();
const metadata = doc.getMetadata();
```